Individual Poster Page

See copyright notice at the bottom of this page.

Offensive Performance, Omitted Variables, and the Value of Speed in Baseball (November 6, 2003)

Posted 12:26 p.m., November 7, 2003 (#3) - Ted T
Two notes for those of you who decide to read the paper:

(1) Remember that it's an econometrics paper about baseball, not a baseball paper about econometrics. Because of the audience it's written for, some of the stuff in there is going to seem very basic to readers on this site. The main point of the paper is methodological. For many years, I took it on faith that regression on team statistics would give the same estimates as Pete Palmer got from his simulation; but when I read Albert and Bennett's book, I realized it wasn't true.

(2) It's certainly true that the equilibrium success percentages for win-probability teams jump around; but, as tangotiger notes, they stay in the range from the low 50s to the high 80s. A naive regression estimator such as the one in Albert and Bennett gives estimates that make no sense in this context. But, if we use that range as a baseline for what the relative SB and CS coefficients must look like, we can use it to evaluate whether we've adequately instrumented for speed. There's tons of heterogeneity in this data, both in situations (as noted) and across players -- but the fact that most equilibrium success percentages cluster in the 60s and 70s means we still should -- and it appears we do -- get approximate aggregation.
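To make the "baseline for the relative SB and CS coefficients" concrete: if a steal attempt is worth b_SB runs when it succeeds and b_CS runs (negative) when it fails, the break-even success rate p solves p*b_SB + (1-p)*b_CS = 0. The following is only a minimal sketch of that arithmetic. The function name and the example weights (0.30 and -0.60) are illustrative assumptions, not estimates taken from the paper.

```python
def break_even_success_rate(sb_coef, cs_coef):
    """Success rate p at which p*sb_coef + (1 - p)*cs_coef equals zero.

    sb_coef: run value of a successful steal (assumed positive).
    cs_coef: run value of a caught stealing (assumed negative).
    """
    if sb_coef <= 0 or cs_coef >= 0:
        raise ValueError("expected sb_coef > 0 and cs_coef < 0")
    # Solve p*sb + (1 - p)*cs = 0  =>  p = -cs / (sb - cs)
    return -cs_coef / (sb_coef - cs_coef)


# Illustrative weights only (assumed for this example, not from the paper).
rate = break_even_success_rate(0.30, -0.60)
print(f"break-even success rate: {rate:.1%}")
```

A pair of regression coefficients whose implied break-even rate falls far outside the low-50s-to-high-80s range would be a sign that speed has not been adequately controlled for.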

So, I see the paper as being exactly the opposite of what Nick S got out of it: the relative values of Palmer's SB and CS weights (the original ones from the simulation) make sense if we bear in mind that basestealing is an elective play, and we can get the same result from "real" data if we control for quality of baserunners appropriately. The fact that passing from expected runs to winning percentage doesn't make a huge difference in strategy is one of the things that makes the regression work OK.

By the way, I'm interested in ideas for other instruments I might be able to use to get the SB3 and CS3 coefficients to make more sense; it'd really improve the paper. I haven't had any success yet, but am going to try various flavors of bases advanced on hits next. I wish I could use infield singles, but the Retrosheet data isn't good enough for that for most years. :(

Offensive Performance, Omitted Variables, and the Value of Speed in Baseball (November 6, 2003)

Posted 8:55 a.m., November 10, 2003 (#13) - Ted T
J. Cross: I think you have the science backwards here. The paper starts with a hypothesis that stolen base behavior is occurring according to some optimal scheme. There's a testable implication of that optimality: that the estimated net contribution of the marginal attempt is zero. The paper looks at the data, and finds that we cannot reject the hypothesis. That's a *much* weaker statement than *accepting* the hypothesis that there's rationality going on, which is what you ask. But it's the strongest statement science is able to make -- we can only reject or fail to reject hypotheses.

Of course, there are lots of other things that could go on and still give the result the paper has. The ideas you list are certainly relevant and could also well be going on. I thought about the agency question (stealing too much to pad numbers) early on in this project, but couldn't figure out a coherent model that had a testable implication. I still think it's an interesting area to look at.

But, overall, these non-optimalities, if they exist (and they probably do), couldn't be too systematic, or the resulting parameter estimates we get from the data wouldn't be what they are. So "on average" in some sense, the data suggest that overall basestealing behavior is likely not to be too suboptimal.

Offensive Performance, Omitted Variables, and the Value of Speed in Baseball (November 6, 2003)

Posted 9:02 a.m., November 10, 2003 (#14) - Ted T
tango - Are you basing your assertion on personal opinion or science?

The game-theory-based paper I wrote a few months ago (which if I recall correctly you've seen, since I believe you had some comments for me) implies that the equilibrium attempt frequency goes up very rapidly for a marginal change in talent for the best basestealers. The differential equation says that d(attempt frequency)/d(talent) is proportional to the square of attempt frequency, which means attempt frequency goes up even faster than an exponential (the solution would be an exponential if the derivative were proportional to the attempt frequency itself, not its square).
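As a worked version of that comparison, write f for attempt frequency and t for talent. The symbols f, t, f_0, t_0 and the constant k > 0 are notation assumed here for illustration; the paper's own model may be parameterized differently.

```latex
% Case described above: derivative proportional to the square of f
\frac{df}{dt} = k f^{2}
\;\Longrightarrow\;
-\frac{1}{f} = k t + C
\;\Longrightarrow\;
f(t) = \frac{f_0}{1 - k f_0 \,(t - t_0)}, \qquad f(t_0) = f_0 .

% Comparison case: derivative proportional to f itself
\frac{df}{dt} = k f
\;\Longrightarrow\;
f(t) = f_0 \, e^{k (t - t_0)} .
```

The first solution grows without bound as t approaches t_0 + 1/(k f_0), a finite value, whereas the exponential stays finite for every finite t. That is the sense in which the squared case rises faster than an exponential.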

So theory expects that the handful of the best basestealers should steal noticeably more than the second-tier guys, which is what we observe in the data. In fact, the game in that paper does a pretty good job of predicting the cross-sectional distribution of both attempt frequency (very skewed) and success percentage (not skewed) in the population of players.

I'm not taking a position on Rickey per se, because figuring the "correct" steal frequency would require direct observation of "talent", which we don't observe. But both qualitatively and quantitatively, Rickey's numbers aren't out of whack with a plausible model of stolen base strategy for an elite baserunner.

Offensive Performance, Omitted Variables, and the Value of Speed in Baseball (November 6, 2003)

Posted 11:13 a.m., November 10, 2003 (#16) - Ted T
tango -

The last public draft's on my website at

http://econweb.tamu.edu/turocy/papers/sbecon.htm

I'm redrafting a final version now. (I was in the process of working on it when I factored out the paper we're discussing here.) I don't think the version on the web has the comparative statics calculation I cite, but I can send that to your email if you'd like.

Effect of SB attempt on batter (November 10, 2003)

Posted 3:19 p.m., November 10, 2003 (#4) - Ted T
What's definitely true in the data is that the change in batter performance with runner on first only versus bases empty is largest (by a significant amount) in aggregate for batters in the #2 slot, followed by the #3 slot. For the remaining slots (#4 through #9, plus #1), the change is positive but statistically indistinguishable from a constant across slots.

My pet interpretation of this data is that we must be seeing some difference in defensive conduct here related to the identity of the runners on first in these situations.

Note that, of course, "runner on first only" is basically "conditional on not attempting a SB". As is well known, conditional on a SB attempt, performance does decrease.

The Problem With "Total Clutch" Hitting Statistics (December 1, 2003)

Posted 10:08 p.m., December 2, 2003 (#30) - Ted T
Yes, only looking at subsamples (such as the 110-119 OPS guys) is methodologically bogus.

Here's why the R^2 plummets. In selecting on a range like this, it cuts down the variation due to OPS in the population. However, it *doesn't* cut down the other variation in that sample. So R^2 will certainly fall. To see this, consider the polar case: if we only selected guys with OPS=115, say, our R^2 would be zero -- we couldn't explain *any* variation using OPS!
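The range-restriction effect is easy to see in a small simulation. This is only a sketch on made-up data: the index scale (mean 100, standard deviation 15), the slope of 0.5 and the noise level of 10 are all assumptions chosen for illustration, and none of it is Cyril's data.

```python
import random


def r_squared(xs, ys):
    """R^2 of a simple least-squares fit of ys on xs (squared correlation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        # Polar case: no variation in x (or y), so x explains nothing.
        return 0.0
    return sxy ** 2 / (sxx * syy)


def simulate(n=20000, seed=0):
    """Compare R^2 on the full sample with R^2 on the 110-119 slice."""
    rng = random.Random(seed)
    # Hypothetical OPS-style index and an outcome that depends on it plus noise.
    ops = [rng.gauss(100, 15) for _ in range(n)]
    outcome = [0.5 * x + rng.gauss(0, 10) for x in ops]
    full = r_squared(ops, outcome)
    # Keep only the narrow band; the noise variance is unchanged by this.
    band = [(x, y) for x, y in zip(ops, outcome) if 110 <= x < 120]
    restricted = r_squared([x for x, _ in band], [y for _, y in band])
    return full, restricted


full_r2, restricted_r2 = simulate()
print(f"full sample R^2:       {full_r2:.3f}")
print(f"110-119 subsample R^2: {restricted_r2:.3f}")
```

The slope relating the index to the outcome is the same in both samples. Only the share of variance attributable to the index shrinks, which is why the subsample R^2 falls.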

I also don't understand this talk about "outliers". In the scatterplot I looked at based on Cyril's data, the guys at the extremes aren't far off the regression line. They aren't outliers. And it's not true that the Bondses of the world carry heavier weight, because least squares is on *differences*, not absolute values.

Cubs create job to analyze numbers (December 4, 2003)

Posted 1:42 p.m., December 4, 2003 (#3) - Ted T
I know Chuck decently... he's been the assistant PR guy for the Cubs for a while. His primary responsibility is data gathering and dissemination, and he's good at that. He doesn't do any *analysis* though. So far as I can tell, it *is* just a title change, and he's probably going to do more or less the same thing as before.

I guess it's good for an organization to realize that gathering and organizing data is an essential part of player evaluation. Of course, knowing how to *interpret* the data is kinda important too -- unfortunately, that seems to be what's going to be missing here.

List of top basestealers (December 9, 2003)

Posted 9:23 p.m., December 10, 2003 (#1) - Ted T
These numbers don't measure the value of stolen base activity; they're primarily measuring the active randomization inherent in optimal stolen base strategy.

Copyright notice

Comments on this page were made by person(s) with the same handle, in various comments areas, following Tangotiger © material, on Baseball Primer. All content on this page remains the sole copyright of the author of those comments.

If you are the author, and you wish to have these comments removed from this site, please send me an email (tangotiger@yahoo.com), along with (1) the URL of this page, and (2) a statement that you are in fact the author of all comments on this page, and I will promptly remove them.